The FCC (Fluid Catalytic Cracking) condenser is a critical component in refinery operations. The unit takes sweet Vacuum Gas Oil as feed and cracks the heavy oil into lighter products such as LPG and gasoline.
Fluctuations in the Reactor-Regenerator differential pressure (DP) can cause a reversal of catalyst circulation, which as a safety measure may trigger the Emergency Shutdown (ESD) system. This project applies Machine Learning (ML) techniques to detect anomalies in FCC condenser efficiency.
The dataset contains operational parameters, including temperature, pressure, flow rates, and velocities. The preprocessing and analysis steps are walked through below.
Exploratory Data Analysis (EDA) was conducted to visualize trends and spot potential anomalies. Two anomaly-detection approaches were then applied:
1. Principal Component Analysis (PCA)
PCA reduced high-dimensional data to two principal components.
Anomalies were detected by analyzing data points far from the normal cluster.
2. Autoencoder (Deep Learning)
A neural network trained to reconstruct normal data patterns.
Reconstruction error was used to flag high-error data points as anomalies.
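Both approaches share one pattern: learn a compact representation of normal operation, then flag samples whose reconstruction error is unusually large. A minimal sketch of this idea on synthetic data (all names here are illustrative, not the project's variables):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# "Normal" operation: 5 measured variables driven by 2 latent factors plus small noise
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 5))
normal = latent @ mixing + 0.05 * rng.normal(size=(500, 5))
# "Abnormal" samples that break the correlation structure
anomaly = 3.0 * rng.normal(size=(5, 5))

pca = PCA(n_components=2).fit(normal)  # learn the normal subspace

def recon_error(X):
    """L1 reconstruction error per sample after projecting through the PCs."""
    return np.abs(X - pca.inverse_transform(pca.transform(X))).sum(axis=1)

threshold = np.percentile(recon_error(normal), 99)  # control limit from normal data
print(recon_error(anomaly) > threshold)             # abnormal samples exceed the limit
```

The same fit-on-normal / threshold-on-error recipe is what the PCA and autoencoder sections below implement on the plant data.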
Importing libraries and datasets
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
plt.rcParams['figure.figsize'] = [12, 6]
plt.rcParams.update({'font.size': 12})
df = pd.read_excel('columns.xlsx')
col = df['Symbol'].values
col = col.tolist()
df_stableFeedFlow = pd.read_csv(r'https://raw.githubusercontent.com/AshuPraja13/FCC-abnormality-detection/main/NOC_stableFeedFlow_outputs.csv',header=None)
df_stableFeedFlow.index = df_stableFeedFlow.iloc[:, 0]  # use the first column as the index
df_stableFeedFlow = df_stableFeedFlow.drop(columns=0)
df_stableFeedFlow.columns= col
df_stableFeedFlow.sample(5)
| | F3 | Tatm | T1 | P4 | deltaP | P6 | Fair | T3 | T2 | Tr | ... | FLCO | FSlurry | FReflux | Tfra | T10 | T20 | V9 | V8 | V10 | V11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1801 | 165.06 | 78.821 | 460.53 | 34.4 | -6.4 | 28 | 2.6793 | 1566.7 | 615.99 | 968.96 | ... | 1644.9 | 218.89 | 3006.8 | 314.30 | 509.93 | 628.46 | 46.668 | 49.932 | 50.217 | 47.279 |
| 634 | 164.79 | 79.947 | 461.09 | 34.4 | -6.4 | 28 | 2.6799 | 1561.6 | 616.01 | 969.05 | ... | 1645.1 | 213.78 | 2993.1 | 314.71 | 509.75 | 628.01 | 46.003 | 49.634 | 49.512 | 47.311 |
| 2183 | 165.06 | 80.040 | 460.64 | 34.4 | -6.4 | 28 | 2.6831 | 1566.0 | 616.00 | 969.03 | ... | 1647.3 | 209.26 | 3001.1 | 314.86 | 509.84 | 628.04 | 45.829 | 49.743 | 49.279 | 47.429 |
| 2610 | 165.20 | 78.028 | 460.16 | 34.4 | -6.4 | 28 | 2.6835 | 1569.8 | 615.99 | 969.01 | ... | 1644.6 | 208.76 | 2986.8 | 313.36 | 509.92 | 628.18 | 47.757 | 49.760 | 48.978 | 47.269 |
| 1803 | 165.03 | 78.832 | 460.24 | 34.4 | -6.4 | 28 | 2.6795 | 1568.2 | 615.99 | 968.97 | ... | 1644.1 | 218.69 | 3005.7 | 314.29 | 509.92 | 628.44 | 46.681 | 49.915 | 50.222 | 47.233 |
5 rows × 46 columns
df_varyingFeedFlow=pd.read_csv(r'https://raw.githubusercontent.com/AshuPraja13/FCC-abnormality-detection/main/NOC_varyingFeedFlow_outputs.csv',header=None)
df_varyingFeedFlow.index = df_varyingFeedFlow.iloc[:, 0]  # use the first column as the index
df_varyingFeedFlow = df_varyingFeedFlow.drop(columns=0)
df_varyingFeedFlow.columns= col
df_varyingFeedFlow.sample(5)
| | F3 | Tatm | T1 | P4 | deltaP | P6 | Fair | T3 | T2 | Tr | ... | FLCO | FSlurry | FReflux | Tfra | T10 | T20 | V9 | V8 | V10 | V11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1193 | 168.40 | 77.862 | 460.78 | 34.4 | -6.4 | 28 | 2.7362 | 1585.9 | 616.00 | 969.00 | ... | 1699.1 | 205.70 | 3517.3 | 322.11 | 513.91 | 633.47 | 39.600 | 56.977 | 52.924 | 50.034 |
| 5673 | 162.66 | 76.135 | 461.16 | 34.4 | -6.4 | 28 | 2.6426 | 1548.0 | 616.00 | 969.01 | ... | 1606.8 | 221.66 | 2614.5 | 305.18 | 506.66 | 623.40 | 56.834 | 44.856 | 47.081 | 45.430 |
| 7041 | 165.80 | 76.944 | 460.74 | 34.4 | -6.4 | 28 | 2.6944 | 1570.0 | 616.01 | 969.02 | ... | 1652.2 | 212.57 | 3081.2 | 314.41 | 510.76 | 629.54 | 46.994 | 51.151 | 49.849 | 47.619 |
| 5056 | 162.08 | 80.012 | 460.68 | 34.4 | -6.4 | 28 | 2.6346 | 1547.3 | 615.99 | 969.03 | ... | 1591.9 | 230.72 | 2575.0 | 307.03 | 505.85 | 622.36 | 53.811 | 43.865 | 47.206 | 44.670 |
| 4831 | 162.04 | 79.589 | 460.43 | 34.4 | -6.4 | 28 | 2.6340 | 1548.6 | 616.01 | 969.03 | ... | 1590.8 | 230.83 | 2564.0 | 306.53 | 505.78 | 622.23 | 54.441 | 43.762 | 47.107 | 44.615 |
5 rows × 46 columns
df_deltaP_increase = pd.read_csv(r'https://raw.githubusercontent.com/AshuPraja13/FCC-abnormality-detection/main/deltaP_increase_outputs.csv',header=None)
df_deltaP_increase.index = df_deltaP_increase.iloc[:, 0]  # use the first column as the index
df_deltaP_increase = df_deltaP_increase.drop(columns=0)
df_deltaP_increase.columns= col
df_deltaP_increase.sample(5)
| | F3 | Tatm | T1 | P4 | deltaP | P6 | Fair | T3 | T2 | Tr | ... | FLCO | FSlurry | FReflux | Tfra | T10 | T20 | V9 | V8 | V10 | V11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 483 | 165.15 | 79.501 | 460.62 | 34.8 | -6.8 | 28 | 2.6814 | 1566.7 | 616.00 | 968.97 | ... | 1649.6 | 211.56 | 3005.2 | 314.62 | 509.89 | 628.20 | 46.184 | 49.852 | 49.831 | 47.558 |
| 33 | 165.03 | 75.438 | 461.00 | 34.4 | -6.4 | 28 | 2.6807 | 1563.6 | 616.00 | 969.00 | ... | 1640.7 | 213.97 | 2945.9 | 310.97 | 509.80 | 628.09 | 50.679 | 49.492 | 48.761 | 47.057 |
| 724 | 165.21 | 80.059 | 460.27 | 34.8 | -6.8 | 28 | 2.6856 | 1569.2 | 616.00 | 969.04 | ... | 1647.8 | 206.93 | 3018.9 | 315.16 | 510.01 | 628.25 | 45.523 | 49.988 | 49.071 | 47.443 |
| 1153 | 164.97 | 78.208 | 460.59 | 34.8 | -6.8 | 28 | 2.6783 | 1565.8 | 616.00 | 968.97 | ... | 1639.3 | 216.10 | 2968.5 | 313.23 | 509.71 | 628.03 | 47.883 | 49.475 | 49.367 | 46.980 |
| 1245 | 165.05 | 77.363 | 460.22 | 34.8 | -6.8 | 28 | 2.6835 | 1568.6 | 616.01 | 969.04 | ... | 1647.2 | 211.48 | 2996.3 | 313.13 | 510.04 | 628.42 | 48.124 | 49.958 | 49.234 | 47.405 |
5 rows × 46 columns
EDA
df_deltaP_increase.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1440 entries, 0 to 1439
Data columns (total 46 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   F3           1440 non-null   float64
 1   Tatm         1440 non-null   float64
 2   T1           1440 non-null   float64
 3   P4           1440 non-null   float64
 4   deltaP       1440 non-null   float64
 5   P6           1440 non-null   int64
 6   Fair         1440 non-null   float64
 7   T3           1440 non-null   float64
 8   T2           1440 non-null   float64
 9   Tr           1440 non-null   float64
 10  Treg         1440 non-null   float64
 11  Lsp          1440 non-null   float64
 12  Tcyc         1440 non-null   float64
 13  Tcyc - Treg  1440 non-null   float64
 14  Cco,sg       1440 non-null   int64
 15  Co2,sg       1440 non-null   float64
 16  P5           1440 non-null   float64
 17  V4           1440 non-null   float64
 18  V6           1440 non-null   float64
 19  V7           1440 non-null   float64
 20  V3           1440 non-null   float64
 21  V1           1440 non-null   float64
 22  V2           1440 non-null   float64
 23  Frgc         1440 non-null   int64
 24  Fsc          1440 non-null   int64
 25  ACAB         1440 non-null   float64
 26  AWGC         1440 non-null   float64
 27  F5           1440 non-null   float64
 28  F7           1440 non-null   float64
 29  Fsg          1440 non-null   float64
 30  FV11         1440 non-null   int64
 31  P1           1440 non-null   float64
 32  P2           1440 non-null   float64
 33  FLPG         1440 non-null   float64
 34  FLN          1440 non-null   float64
 35  FHN          1440 non-null   float64
 36  FLCO         1440 non-null   float64
 37  FSlurry      1440 non-null   float64
 38  FReflux      1440 non-null   float64
 39  Tfra         1440 non-null   float64
 40  T10          1440 non-null   float64
 41  T20          1440 non-null   float64
 42  V9           1440 non-null   float64
 43  V8           1440 non-null   float64
 44  V10          1440 non-null   float64
 45  V11          1440 non-null   float64
dtypes: float64(41), int64(5)
memory usage: 517.6 KB
df_deltaP_increase.describe().T
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| F3 | 1440.0 | 165.012694 | 1.582447e-01 | 164.53000 | 164.890000 | 165.020000 | 165.140000 | 165.480000 |
| Tatm | 1440.0 | 78.342193 | 1.491834e+00 | 74.98100 | 77.191500 | 78.746000 | 79.687250 | 80.070000 |
| T1 | 1440.0 | 460.899750 | 3.939347e-01 | 459.58000 | 460.650000 | 460.910000 | 461.140000 | 462.050000 |
| P4 | 1440.0 | 34.732485 | 1.372139e-01 | 34.40000 | 34.800000 | 34.800000 | 34.800000 | 34.800000 |
| deltaP | 1440.0 | -6.732483 | 1.372164e-01 | -6.80000 | -6.800000 | -6.800000 | -6.800000 | -6.400000 |
| P6 | 1440.0 | 28.000000 | 0.000000e+00 | 28.00000 | 28.000000 | 28.000000 | 28.000000 | 28.000000 |
| Fair | 1440.0 | 2.680606 | 2.895453e-03 | 2.67220 | 2.678575 | 2.680800 | 2.682800 | 2.687600 |
| T3 | 1440.0 | 1564.149444 | 2.865294e+00 | 1555.80000 | 1562.200000 | 1564.100000 | 1566.000000 | 1573.500000 |
| T2 | 1440.0 | 616.000063 | 7.019004e-03 | 615.98000 | 616.000000 | 616.000000 | 616.000000 | 616.020000 |
| Tr | 1440.0 | 968.996868 | 3.865328e-02 | 968.82000 | 968.980000 | 969.000000 | 969.020000 | 969.250000 |
| Treg | 1440.0 | 1250.000000 | 4.875090e-02 | 1249.90000 | 1250.000000 | 1250.000000 | 1250.000000 | 1250.100000 |
| Lsp | 1440.0 | 29.807753 | 2.940493e-01 | 29.46500 | 29.651000 | 29.733000 | 29.829250 | 31.453000 |
| Tcyc | 1440.0 | 1255.280000 | 5.018382e-02 | 1255.10000 | 1255.300000 | 1255.300000 | 1255.300000 | 1255.400000 |
| Tcyc - Treg | 1440.0 | 5.280165 | 3.503803e-02 | 5.14660 | 5.256475 | 5.282450 | 5.303500 | 5.354700 |
| Cco,sg | 1440.0 | 29883.704167 | 5.020733e+01 | 29756.00000 | 29852.000000 | 29883.000000 | 29913.250000 | 30112.000000 |
| Co2,sg | 1440.0 | 0.012481 | 1.555107e-04 | 0.01191 | 0.012373 | 0.012489 | 0.012586 | 0.012819 |
| P5 | 1440.0 | 24.900000 | 6.752501e-13 | 24.90000 | 24.900000 | 24.900000 | 24.900000 | 24.900000 |
| V4 | 1440.0 | 46.261990 | 3.630540e-01 | 45.37900 | 46.008750 | 46.372500 | 46.558000 | 46.840000 |
| V6 | 1440.0 | 24.800935 | 8.767327e-02 | 24.53000 | 24.746000 | 24.809000 | 24.862250 | 25.002000 |
| V7 | 1440.0 | 54.592518 | 5.902779e-02 | 54.42100 | 54.551000 | 54.595500 | 54.637000 | 54.735000 |
| V3 | 1440.0 | 46.458697 | 2.311408e-01 | 46.29400 | 46.330000 | 46.346000 | 46.402500 | 46.995000 |
| V1 | 1440.0 | 57.939251 | 2.065041e-01 | 57.34300 | 57.801000 | 57.934000 | 58.073000 | 58.619000 |
| V2 | 1440.0 | 45.866167 | 3.305920e-01 | 45.03700 | 45.750000 | 46.026000 | 46.077000 | 46.211000 |
| Frgc | 1440.0 | 49584.829167 | 5.778035e+01 | 49414.00000 | 49544.750000 | 49589.000000 | 49628.250000 | 49742.000000 |
| Fsc | 1440.0 | 49584.416667 | 5.891687e+01 | 49396.00000 | 49544.000000 | 49588.000000 | 49627.250000 | 49831.000000 |
| ACAB | 1440.0 | 280.753229 | 1.339416e+00 | 277.20000 | 279.750000 | 281.145000 | 281.830000 | 282.770000 |
| AWGC | 1440.0 | 204.616125 | 2.498677e+00 | 198.60000 | 202.837500 | 205.390000 | 206.670000 | 208.490000 |
| F5 | 1440.0 | 1990.668125 | 7.094397e+00 | 1970.20000 | 1985.900000 | 1990.500000 | 1995.300000 | 2014.000000 |
| F7 | 1440.0 | 3737.047153 | 5.024231e+00 | 3722.70000 | 3733.600000 | 3737.200000 | 3740.700000 | 3747.800000 |
| Fsg | 1440.0 | 160.836479 | 1.739822e-01 | 160.33000 | 160.710000 | 160.850000 | 160.970000 | 161.260000 |
| FV11 | 1440.0 | 28114.860417 | 2.735492e+02 | 27445.00000 | 27925.000000 | 28199.000000 | 28337.000000 | 28547.000000 |
| P1 | 1440.0 | 14.638000 | 7.640988e-14 | 14.63800 | 14.638000 | 14.638000 | 14.638000 | 14.638000 |
| P2 | 1440.0 | 35.042184 | 2.132075e-02 | 35.00000 | 35.028000 | 35.038000 | 35.056000 | 35.105000 |
| FLPG | 1440.0 | 3037.619167 | 4.410832e+01 | 2931.30000 | 3006.075000 | 3051.150000 | 3073.900000 | 3106.600000 |
| FLN | 1440.0 | 3918.058750 | 4.528881e+01 | 3853.20000 | 3878.700000 | 3905.050000 | 3953.225000 | 4029.600000 |
| FHN | 1440.0 | 708.819583 | 3.444213e+00 | 697.97000 | 706.727500 | 708.720000 | 711.040000 | 717.330000 |
| FLCO | 1440.0 | 1643.684931 | 3.785560e+00 | 1632.90000 | 1640.800000 | 1643.700000 | 1646.500000 | 1652.200000 |
| FSlurry | 1440.0 | 213.935750 | 3.735500e+00 | 205.90000 | 211.105000 | 214.095000 | 216.560000 | 224.080000 |
| FReflux | 1440.0 | 2984.278819 | 1.873678e+01 | 2938.60000 | 2972.300000 | 2986.800000 | 2996.200000 | 3029.200000 |
| Tfra | 1440.0 | 313.544986 | 1.203643e+00 | 310.71000 | 312.657500 | 313.910000 | 314.560000 | 315.330000 |
| T10 | 1440.0 | 509.834722 | 1.324477e-01 | 509.50000 | 509.740000 | 509.835000 | 509.930000 | 510.150000 |
| T20 | 1440.0 | 628.158278 | 1.590511e-01 | 627.84000 | 628.030000 | 628.140000 | 628.280000 | 628.470000 |
| V9 | 1440.0 | 47.514651 | 1.504642e+00 | 45.38300 | 46.216000 | 47.063500 | 48.650500 | 51.247000 |
| V8 | 1440.0 | 49.686217 | 1.915637e-01 | 49.28700 | 49.533750 | 49.693000 | 49.844000 | 50.131000 |
| V10 | 1440.0 | 49.368616 | 4.827719e-01 | 47.87600 | 49.071000 | 49.350500 | 49.679000 | 50.565000 |
| V11 | 1440.0 | 47.222166 | 2.078380e-01 | 46.63000 | 47.067000 | 47.227500 | 47.379000 | 47.687000 |
sns.heatmap(df_deltaP_increase.corr(),cmap='coolwarm')
for n, i in enumerate(df_stableFeedFlow.columns):
    plt.figure(figsize=(12, 2))
    plt.plot(df_stableFeedFlow[i])
    plt.xlabel('time (mins)')
    plt.ylabel(i)
    plt.title(df[df['Symbol'] == i]['Description'].values)
    plt.show()
for n, i in enumerate(df_varyingFeedFlow.columns):
    plt.figure(figsize=(12, 2))
    plt.plot(df_varyingFeedFlow[i])
    plt.xlabel('time (mins)')
    plt.ylabel(i)
    plt.title(df[df['Symbol'] == i]['Description'].values)
    plt.show()
for n, i in enumerate(df_deltaP_increase.columns):
    plt.figure(figsize=(12, 2))
    plt.plot(df_deltaP_increase[i])
    plt.xlabel('time (mins)')
    plt.ylabel(i)
    plt.title(df[df['Symbol'] == i]['Description'].values)
    plt.show()
Scaling the data to mean = 0 and std = 1 using StandardScaler.
from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
X = ss.fit_transform(df_stableFeedFlow)
Applying PCA
from sklearn.decomposition import PCA
pca = PCA()
X_pca = pca.fit_transform(X)
plt.figure(figsize=(15, 6))
sns.set_style('whitegrid')
sns.lineplot(x=list(range(1, 47)), y=np.cumsum(pca.explained_variance_ratio_), drawstyle='steps-pre')
sns.lineplot(x=list(range(1, 47)), y=np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('Number of principal components')
plt.ylabel('Cumulative explained variance ratio')
plt.title('Cumulative variance explained by the principal components')
plt.show()
The plot clearly shows that 10 components already explain more than 98% of the variance, so the feature space is reduced from 46 to 10 dimensions.
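The cutoff can also be located programmatically instead of being read off the plot. A small self-contained sketch (the correlated toy data here just stands in for the scaled plant measurements):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# Toy data: 8 variables, but each pair is strongly correlated, so ~4 real dimensions
base = rng.normal(size=(300, 4))
X_demo = np.hstack([base, base + 0.1 * rng.normal(size=(300, 4))])

cum_var = np.cumsum(PCA().fit(X_demo).explained_variance_ratio_)
# searchsorted finds the first index where the cumulative variance reaches 98%
n_keep = int(np.searchsorted(cum_var, 0.98)) + 1
print(n_keep)  # 4 components reach the 98% mark for this toy data
```

Applied to the plant data, the same `searchsorted` call over `np.cumsum(pca.explained_variance_ratio_)` would return the 10 used below.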
pca = PCA(n_components=10)
X_pca = pca.fit_transform(X)
plt.figure(figsize=(15, 6))
sns.set_style('whitegrid')
sns.lineplot(x=list(range(1, 11)), y=np.cumsum(pca.explained_variance_ratio_), drawstyle='steps-pre')
sns.lineplot(x=list(range(1, 11)), y=np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('Number of principal components')
plt.ylabel('Cumulative explained variance ratio')
plt.title('Cumulative variance explained by the 10 retained components')
plt.show()
Applying Autoencoders
X_train = X.reshape(-1, 46, 1)  # one sample = the 46 scaled variables as a length-46 sequence
Let's create a Sequential model with a Bidirectional LSTM and train it on data from steady-state plant operation. A 20% dropout is used to reduce overfitting.
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(256,return_sequences=True),input_shape=(46,1)))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128,return_sequences=True)))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
 bidirectional (Bidirectional)      (None, 46, 512)           528384
 dropout (Dropout)                  (None, 46, 512)           0
 bidirectional_1 (Bidirectional)    (None, 46, 256)           656384
 dropout_1 (Dropout)                (None, 46, 256)           0
 dense (Dense)                      (None, 46, 1)             257
=================================================================
Total params: 1185025 (4.52 MB)
Trainable params: 1185025 (4.52 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
model.fit(X_train,X_train,epochs=30)
Epoch 1/30
90/90 [==============================] - 29s 249ms/step - loss: 0.1533 - mae: 0.2153
Epoch 2/30
90/90 [==============================] - 22s 248ms/step - loss: 0.0031 - mae: 0.0389
...
Epoch 29/30
90/90 [==============================] - 37s 411ms/step - loss: 0.0014 - mae: 0.0241
Epoch 30/30
90/90 [==============================] - 37s 414ms/step - loss: 0.0014 - mae: 0.0243
Calculating the reconstruction error using MAE.
The 99th percentile of the training error is taken as the control limit; errors below it are treated as consistent with steady-state operation.
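The control-limit construction itself is just a percentile over the training errors; samples whose error exceeds it are flagged. An isolated sketch with synthetic errors (the +5 offset is an artificial fault, not project data):

```python
import numpy as np

rng = np.random.default_rng(2)
# Stand-in for per-sample reconstruction errors during normal operation
train_error = np.abs(rng.normal(0, 1, size=2880))
control_limit = np.percentile(train_error, 99)  # 99th-percentile control limit

test_error = np.abs(rng.normal(0, 1, size=100))
test_error[40:] += 5.0                          # inject a sustained deviation
alarms = test_error > control_limit             # boolean alarm flags per sample
print(control_limit, alarms.sum())
```

A 99% limit deliberately tolerates roughly 1% false alarms on normal data, which is why the fault-isolation logic later in this notebook waits for a sustained run of violations rather than a single one.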
error_ae = []
for i in range(X.shape[0]):
    y_pred = model.predict(X[i].reshape(1, 46, 1), verbose=0)[0, :, 0]
    error_ae.append(np.abs(X[i] - y_pred).sum())
AE_CL = np.percentile(error_ae, 99)
pd.Series(error_ae).plot()
plt.hlines(AE_CL, 0, len(error_ae), colors='red', linestyles='--')
Calculating the reconstruction error using the Q-test, the T²-test, and cosine similarity.
The 99th percentile of each training metric is taken as the control limit for steady-state operation.
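For reference, the three metrics can be written compactly on their own. This sketch assumes standardized data and a fitted scikit-learn `PCA`; the toy data and names are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardized data

pca = PCA(n_components=3).fit(X)
scores = pca.transform(X)                 # projection onto the retained PCs
X_rec = pca.inverse_transform(scores)     # back-projection to the 6-D space

# Q: size of the reconstruction residual (an L1 version, matching this notebook)
Q = np.abs(X - X_rec).sum(axis=1)
# T^2: distance in score space, each direction scaled by its variance
T2 = np.sum(scores**2 / pca.explained_variance_, axis=1)
# Cosine similarity between each sample and its reconstruction
cos = np.sum(X * X_rec, axis=1) / (np.linalg.norm(X, axis=1) * np.linalg.norm(X_rec, axis=1))

Q_CL, T_CL = np.percentile(Q, 99), np.percentile(T2, 99)
C_CL = np.min(cos)  # lowest training similarity serves as the limit
print(Q_CL, T_CL, C_CL)
```

Dividing the scores element-wise by `pca.explained_variance_` is equivalent to the diagonal matrix inverse used in the loop below, just vectorized.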
X_reconstructed = np.dot(X_pca,pca.components_)
error_pca = X-X_reconstructed
Q_train = np.sum(np.abs(error_pca),axis=1)
Q_CL = np.percentile(Q_train,99)
# Q_train plot with CL
plt.figure()
plt.plot(Q_train, color='black')
plt.plot([1,len(Q_train)],[Q_CL,Q_CL], linestyle='--',color='red', linewidth=2)
plt.xlabel('Sample #')
plt.ylabel('Q metric: training data')
plt.title(f'Q metric max: {Q_train.max():.3f} at {Q_train.argmax()} mins')
plt.show()
lambda_ = np.diag(pca.explained_variance_)
lambda_inv = np.linalg.inv(lambda_)
T_train = np.zeros(X_pca.shape[0])
for i in range(X_pca.shape[0]):
    T_train[i] = np.dot(np.dot(X_pca[i], lambda_inv), X_pca[i].T)
T_CL = np.percentile(T_train,99)
# T2_train plot with CL
plt.figure()
plt.plot(T_train, color='black')
plt.plot([1,len(T_train)],[T_CL,T_CL], linestyle='--',color='red', linewidth=2)
plt.xlabel('Sample #')
plt.ylabel('T$^2$ metric: training data')
plt.title(f'T$^2$ metric max: {T_train.max():.3f} at {T_train.argmax()} mins')
plt.show()
cosine = []
ed = []
X_rec = np.dot(X_pca, pca.components_)  # reconstruct once instead of per sample
for i in range(X.shape[0]):
    v1, v2 = X[i], X_rec[i]
    cosine.append(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))
    ed.append(np.linalg.norm(v1 - v2))
C_CL = np.min(cosine)
E_CL = np.percentile(ed,99)
pd.Series(cosine).plot(color='black')
plt.plot([1,len(cosine)],[C_CL,C_CL], linestyle='--',color='red', linewidth=2)
plt.xlabel('Sample #')
plt.ylabel('Cosine similarity metric: training data')
plt.title('Cosine Similarity')
plt.show()
Q_CL,T_CL,C_CL,E_CL,AE_CL
(4.123113215519074, 20.424350352568812, 0.9427550112367162, 0.9281694746755802, 0.4382286902947803)
Let's create functions for preprocessing the test data and evaluating it with our models.
def Q_test(X, X_pca, pca_components_, Q_CL):
    X_reconstructed = np.dot(X_pca, pca_components_)
    error_pca = X - X_reconstructed
    Q = np.sum(np.abs(error_pca), axis=1)
    # Q plot with the control limit
    plt.figure()
    plt.plot(Q, color='black')
    plt.plot([1, len(Q)], [Q_CL, Q_CL], linestyle='--', color='red', linewidth=2)
    plt.xlabel('Sample #')
    plt.ylabel('Q metric: test data')
    plt.title(f'Q metric max: {Q.max():.3f} at {Q.argmax()} mins')
    plt.show()
    return error_pca
def T_test(X_pca, explained_variance_, T_CL):
    lambda_inv = np.linalg.inv(np.diag(explained_variance_))  # use the passed-in variances
    T = np.zeros(X_pca.shape[0])
    for i in range(X_pca.shape[0]):
        T[i] = np.dot(np.dot(X_pca[i], lambda_inv), X_pca[i].T)
    # T^2 plot with the control limit
    plt.figure()
    plt.plot(T, color='black')
    plt.plot([1, len(T)], [T_CL, T_CL], linestyle='--', color='red', linewidth=2)
    plt.xlabel('Sample #')
    plt.ylabel('T$^2$ metric: test data')
    plt.title(f'T$^2$ metric max: {T.max():.3f} at {T.argmax()} mins')
    plt.show()
def cosine(X, X_transformed, pca_components_, C_CL, E_CL):
    cos_sim = []
    ed = []
    X_rec = np.dot(X_transformed, pca_components_)  # reconstruct once
    for i in range(X.shape[0]):
        v1, v2 = X[i], X_rec[i]
        cos_sim.append(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))
        ed.append(np.linalg.norm(v1 - v2))
    pd.Series(cos_sim).plot(color='black')
    plt.plot([1, len(cos_sim)], [C_CL, C_CL], linestyle='--', color='red', linewidth=2)
    plt.xlabel('Sample #')
    plt.ylabel('Cosine similarity metric: test data')
    plt.title('Cosine Similarity')
    plt.show()
def autoencoder(df_test, CL):
    X_test = ss.transform(df_test)
    error_ae = []
    error_sum = []
    for i in range(X_test.shape[0]):
        y_pred = model.predict(X_test[i].reshape(1, 46, 1), verbose=0)[0, :, 0]
        error_ae.append(np.abs(X_test[i] - y_pred))
        error_sum.append(np.abs(X_test[i] - y_pred).sum())
    error_ae = np.array(error_ae)
    pd.Series(error_sum).plot(color='black')
    plt.hlines(CL, 0, len(error_ae), colors='red', linestyles='--')
    plt.xlabel('Sample #')
    plt.ylabel('Reconstruction error by Autoencoder')
    return error_ae
X = ss.transform(df_varyingFeedFlow)
X_test = pca.transform(X)
error_pca = Q_test(X,X_test,pca.components_,Q_CL)
T_test(X_test,pca.explained_variance_,T_CL)
cosine(X,X_test,pca.components_,C_CL,E_CL)
error_ae = autoencoder(df_varyingFeedFlow,AE_CL)
X = ss.transform(df_deltaP_increase)
X_test = pca.transform(X)
error_pca = Q_test(X,X_test,pca.components_,Q_CL)
T_test(X_test,pca.explained_variance_,T_CL)
cosine(X,X_test,pca.components_,C_CL,E_CL)
error_ae = autoencoder(df_deltaP_increase,AE_CL)
Inference
During steady-state operation the errors stay within their limits, but the error suddenly starts increasing after about 200 mins.
So, let's check which parameters deviate the most from steady state.
We consider the top 10 variables responsible for the plant deviation.
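Picking the top contributors is a plain descending `argsort` over the per-variable absolute errors, which is what the `[:-11:-1]` slice below does for the top 10. A tiny sketch with made-up contributions:

```python
import numpy as np

# Made-up per-variable error contributions (5 variables for brevity)
contrib = np.array([0.2, 3.1, 0.05, 1.7, 0.9])
top3 = np.argsort(contrib)[:-4:-1]  # indices of the 3 largest, descending
print(top3)  # [1 3 4]
```

`np.argsort` sorts ascending, so the reversed slice walks it from the largest value down.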
Q test Error
#%% Q contribution
# Flag the first run of 15 consecutive samples whose Q error exceeds the control limit
error = np.abs(error_pca).sum(axis=1)
cum = []
for index, value in enumerate(error):
    if (value > Q_CL) and (len(cum) < 15):
        cum.append(value)
        if len(cum) == 15:
            sample = index
            break
    else:
        cum = []
print('Time-', sample, 'mins')
error_test_sample = error_pca[sample]
Q_contri = np.abs(error_test_sample)  # vector of per-variable contributions
Time- 146 mins
plt.figure(figsize=[15,4])
plt.bar(['variable ' + str((i+1)) for i in range(len(Q_contri))], Q_contri)
plt.xticks(rotation = 80)
plt.ylabel('Q contributions')
plt.show()
plt.figure(figsize=(15, 40))
print('Time-', sample, 'mins')
for i, n in enumerate(np.argsort(Q_contri)[:-11:-1]):  # top 10 contributors
    plt.subplot(5, 2, i + 1)
    plt.plot(df_deltaP_increase.iloc[:, n], 'blue', linewidth=1)
    plt.xlabel('time (mins)')
    plt.ylabel(df['Symbol'][n])
    plt.title(df['Description'][n])
plt.show()
Time- 146 mins
Autoencoder Error
#%% Autoencoder Error
# Flag the first run of 15 consecutive samples whose autoencoder error exceeds the control limit
error = np.abs(error_ae).sum(axis=1)
cum = []
for index, value in enumerate(error):
    if (value > AE_CL) and (len(cum) < 15):
        cum.append(value)
        if len(cum) == 15:
            sample = index
            break
    else:
        cum = []
print('Time-', sample, 'mins')
error_test_sample = error_ae[sample]
AE_contri = np.abs(error_test_sample)  # vector of per-variable contributions
Time- 198 mins
plt.figure(figsize=(15, 45))
print('Time-', sample, 'mins')
for i, n in enumerate(np.argsort(error_ae[sample])[:-11:-1]):  # top 10 contributors
    plt.subplot(5, 2, i + 1)
    plt.plot(df_deltaP_increase.iloc[:, n], 'blue', linewidth=1)
    plt.xlabel('time (mins)')
    plt.ylabel(df['Symbol'][n])
    plt.title(df['Description'][n])
plt.show()
Time- 198 mins
The RCSV opening has suddenly decreased, and the catalyst level in the reactor suddenly increased and then dropped, indicating that the control valve is closing to reduce the catalyst flow from the Reactor to the Regenerator. This is due to an increase in Reactor pressure and temperature, which increases the flow of spent catalyst through the SCSV and in turn raises the pressure difference between the Reactor and the Regenerator.